
Add proposal for transfer of Storage resources, such as PersistentVolumeClaims or VolumeSnapshots #2326

Closed

Conversation

@huffmanca (Contributor) commented Jan 25, 2021

Attempts to continue the work done by @j-griffith in 643 and 1112 and @mhenriks in 1555 and 2110 to create a generic API for transferring storage resources across namespaces.

xref #2324

/sig storage

@k8s-ci-robot added the sig/storage, cncf-cla: yes, kind/kep, and size/XL labels on Jan 25, 2021
```
name: foo
namespace: source-namespace
kind: VolumeSnapshot
approvalName: approve-foo
```
Member

How does a transfer request know the approvalName? Does this get filled in later by the snapshot controller?

Member

IMO it's part of the agreement. The user who gives the snapshot must pass this name to the user who accepts it, outside of the Kubernetes API.

Member

BTW, I'm looking at OpenStack, which has a very similar concept called volume transfer between projects. There, OpenStack actually generates a token that the current owner of the volume (the old project) must give to the new owner (the new project).

The KEP should note that the name should be at least something that the users can't guess. Using a random string in the example would be helpful.

```
requestName: transfer-foo
```

When matching `StorageTransferRequest` and `StorageTransferApproval` resources are detected, the
Member

We should describe what the matching criteria are.

Contributor Author

The matching criteria have been included and were updated based on today's meeting. As of now the StorageTransferApproval indicates the storage source, and a match is determined only between the StorageTransferRequest and StorageTransferApproval values; it no longer validates that the source matches between the two.

```
targetName: bar
```

`StorageTransferApproval` resource will be created in the target namespace:
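A rough sketch of such an approval, assuming the field names that appear in other excerpts in this thread (requestName, targetName) and a hypothetical API group/version:

```
# Illustrative sketch only; the schema and API group changed across revisions of this KEP.
apiVersion: storage.k8s.io/v1alpha1
kind: StorageTransferApproval
metadata:
  name: approve-foo
  namespace: target-namespace
spec:
  requestName: transfer-foo   # the StorageTransferRequest created in the source namespace
  targetName: bar             # name the transferred resource will get in this namespace
```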


@huffmanca Could you explain more about the StorageTransferApproval and StorageTransferRequest? Do they need to be created manually?
If yes, IMO it's a little complicated to use.

Contributor Author

@qinpingli,

The current design requires that these objects be manually created. It's possible that users could script creation of them if they plan to transfer resources en masse; however, we intentionally need a user from each namespace to "sign off" on the transfer.

```
namespace: target-namespace
spec:
  source:
    name: foo
```


For a user who can access the target namespace but cannot access the source namespace, how can they get the VolumeSnapshot object they want to transfer?

Member

In the proposal it says "provided that both the source and target namespace have a user that has agreed to the transfer." By creating objects in both the source and target namespaces we ensure this agreement actually happened. So users can't steal snapshots from other namespaces, and at the same time they can't "push" their snapshots to namespaces that don't want them.

@xing-yang (Contributor)

@huffmanca can you change "VolumeSnapshot" in the title to "PVC or VolumeSnapshot"?

```
name: transfer-foo
namespace: target-namespace
spec:
  source:
```
Member

If it's really in the source namespace, you can use a TypedLocalObjectReference with APIGroup, Kind, and Name. The namespace is implied from the namespace of the StorageTransferRequest.

Contributor Author

Couldn't we also use an ObjectReference in both the request and approval, and ensure that these match?
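For reference, the two reference shapes being discussed could look roughly like this (values illustrative):

```
# TypedLocalObjectReference, as suggested above: the namespace is implied
# from the namespace of the StorageTransferRequest itself.
source:
  apiGroup: snapshot.storage.k8s.io
  kind: VolumeSnapshot
  name: foo

# ObjectReference: carries an explicit namespace, which the request and the
# approval would then both have to specify and keep in agreement.
source:
  apiVersion: snapshot.storage.k8s.io/v1
  kind: VolumeSnapshot
  name: foo
  namespace: source-namespace
```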

@huffmanca changed the title from "Add proposal for transfer of VolumeSnapshot" to "Add proposal for transfer of Storage resources, such as PersistentVolumeClaims or VolumeSnapshots" on Mar 4, 2021
@j-griffith (Contributor)

I was looking between #2110 and this to match things up, and thinking it might make sense to consolidate to a single KEP that proposes a "csi transfer" feature. That way efforts can be focused on a single KEP and controller. My initial thought is that this seems feasible.

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: huffmanca
To complete the pull request process, please assign wojtek-t, xing-yang after the PR has been reviewed.
You can assign the PR to them by writing /assign @wojtek-t @xing-yang in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@huffmanca (Contributor Author)

@j-griffith ,

I've updated this KEP, but it attempts to continue the work that you and @mhenriks have been leading, since the most recent KEP has stalled. The goal would be to have a generic (for storage) API that can be used for both PVCs and VolumeSnapshots, at least to start with.

I agree that we're looking to create a single controller for both.

```
spec:
  approvalName: xyji6OwynW
  targetName: bar
```
Contributor

Might be worth adding something like a token here that's required on the target side; I believe @jsafrane touched on this already. Even if it were just an optional field that could be specified.

Contributor Author

I've included this in both resources' spec. I think it best if the names were human-readable and the token were required and automatically generated.
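A sketch of how the token could appear in both specs, reusing the field names that show up later in this thread (token on the request, requestToken on the accept) and the random string from the example above; all names illustrative:

```
# In the StorageTransferRequest (source namespace):
spec:
  token: xyji6OwynW            # generated, unguessable; shared out of band
# In the StorageTransferApproval/Accept (target namespace):
spec:
  requestToken: xyji6OwynW     # must match the request's token
```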

- Binds the VolumeSnapshotContent to the newly created VolumeSnapshot. If this was a dynamically provisioned VolumeSnapshotContent, then it adjusts the VolumeSnapshotContent.Spec.Source to point to the newly created VolumeSnapshot.
- Deletes the source VolumeSnapshot.

@j-griffith (Contributor) commented Mar 5, 2021

"transfering" and relabeling snapshots has always been something that made me kinda nervous. For a number of storage backends that use things like cow type snapshots or linked reference to versions this introduces a pretty big problem. The problem is that you could now have a snapshot in ns: foo that's actually linked to a snapshot in ns: bar. (eg sequential snapshots are linked: snap-'n' --> snap-3 --> snap-2 --> snap-1 --> Volume) The two namespaces aren't aware of this linkage on the back end and neither can delete resources. So what happens when I transfer snap-2?

I know some have proposed that backends handle the reference counting in this case and do a sort of 'soft delete' but that doesn't seem great because "somebody" is paying for that storage that they think they deleted.

To transfer snapshots I don't have a great alternative proposal, other than create a PVC from a snapshot and transfer that instead. I'm probably overly paranoid about this though and I don't think there are any other folks that have this concern.

@j-griffith (Contributor)

Thanks @huffmanca for the clarification on some of those points. If #2110 is being combined here, a lot of that content should probably be merged in here. The one other thing I was getting at was that the level of detail around the workflow here, https://github.com/kubernetes/enhancements/pull/2110/files#diff-82617246992480a1d7acd7c0041813170f4e01cb996df54b3530e2db74ed9bcfR95, was really useful. Other than that I'm obviously in favor of the feature :) Really happy to see you've picked up the effort.

@mkimuram (Contributor) commented Apr 29, 2021

@huffmanca @j-griffith @gnufied @jsafrane

I'm thinking about another approach to achieve this feature.
The idea is to add fields to the PVC, instead of introducing new StorageTransferRequest and StorageTransferApproval APIs.

I believe this simplifies the spec and avoids declaring the desired state of a single resource in multiple different places.

What do you think of it?

Details are as follows:

The fields to be added are transfer.source and transfer.destination in PVC spec. Also, some types need to be added to status.conditions.type.

```
spec:
  transfer:
    source:
      namespace: ""
      name: ""
    destination:
      namespace: ""
      name: ""
status: 
  conditions: 
    type: add [transferring|receiving|transferred]
```

For example, let's assume that user A has PVC pvc-a in namespace ns-a and would like to transfer it to PVC pvc-b in namespace ns-b that user B uses.

User operations will be:

  • User A: modify pvc-a to set spec.transfer.destination.namespace=ns-b and spec.transfer.destination.name=pvc-b (And may delete pvc-a later, after it is transferred).
  • User B: create pvc-b with setting spec.transfer.source.namespace=ns-a and spec.transfer.source.name=pvc-a.

Controller behaviors will be:

  • If pvc.status.phase == "Bound" && no pod is using this PVC && spec.transfer.destination is set: add "transferring" to pvc.status.conditions.type,
  • If pvc.status.conditions.type has "transferring": prevent pods from consuming this PVC,
  • If pvc.status.phase == "Pending" && spec.transfer.source is set && pvc.status.conditions.type doesn't have "receiving": add "receiving" to pvc.status.conditions.type,
  • If pvc.status.phase == "Pending" && spec.transfer.source is set && pvc.status.conditions.type has "receiving": check whether the PVC identified by spec.transfer.source.namespace and spec.transfer.source.name has pvc.status.conditions.type "transferring":
    • then: Move the bidirectional PVC/PV references from the source PVC to the destination PVC (secret references in the PV should be updated here, using information from the StorageClass referenced by the destination PVC; validation of PVC compatibility also needs to be done before the swap starts).
      • If succeeded:
        • source PVC: Change pvc.status.phase to "Lost" and change pvc.status.conditions.type from "transferring" to "transferred",
        • destination PVC: Change pvc.status.phase to "Bound" and remove "receiving" from pvc.status.conditions.type.
    • else: do nothing (wait for the source PVC to become "transferring").

The above is just a rough idea that doesn't consider cancellation or error cases, and it doesn't cover snapshots. However, we will be able to consider it in more detail once we agree that this approach looks good (I hope that it can easily be applied to snapshots, e.g. by adding similar fields to VolumeSnapshot and just moving the bidirectional reference to/from VolumeSnapshotContent from source to destination).

@mkimuram (Contributor) commented May 3, 2021

Implemented PoC code for further discussion.
The normal case works as expected.

Console logs are as follows:

```
kubectl create ns ns-a
kubectl create ns ns-b

cat << EOF | kubectl create -n ns-a -f - 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-a
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
EOF

cluster/kubectl.sh describe -n ns-a pvc pvc-a
Name:          pvc-a
Namespace:     ns-a
StorageClass:  standard
Status:        Bound
Volume:        pvc-18304fee-908d-4a59-8587-2b3d64166923
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/host-path
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      5Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Used By:       <none>
Events:
  Type    Reason                 Age   From                         Message
  ----    ------                 ----  ----                         -------
  Normal  ProvisioningSucceeded  6s    persistentvolume-controller  Successfully provisioned volume pvc-18304fee-908d-4a59-8587-2b3d64166923 using kubernetes.io/host-path

kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM        STORAGECLASS   REASON   AGE
pvc-18304fee-908d-4a59-8587-2b3d64166923   5Gi        RWO            Delete           Bound    ns-a/pvc-a   standard                20s
```
  • User A: modify pvc-a to set spec.transfer.destination.namespace=ns-b and spec.transfer.destination.name=pvc-b
```
kubectl patch -n ns-a pvc pvc-a -p '{"spec":{"transfer": {"destination": {"namespace": "ns-b", "name": "pvc-b"}}}}'

cluster/kubectl.sh describe -n ns-a pvc pvc-a
Name:          pvc-a
Namespace:     ns-a
StorageClass:  standard
Status:        Bound
Volume:        pvc-18304fee-908d-4a59-8587-2b3d64166923
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/host-path
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      5Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Transfer:
  Destination:
    Namespace:  ns-b
    Name:       pvc-b
Used By:        <none>
Conditions:
  Type           Status  LastProbeTime                     LastTransitionTime                Reason  Message
  ----           ------  -----------------                 ------------------                ------  -------
  Transferring   True    Mon, 01 Jan 0001 00:00:00 +0000   Mon, 03 May 2021 14:51:33 +0000           Waiting for receiver PVC to receive the bound PV and block consuming this PVC.
Events:
  Type    Reason                 Age   From                         Message
  ----    ------                 ----  ----                         -------
  Normal  ProvisioningSucceeded  31s   persistentvolume-controller  Successfully provisioned volume pvc-18304fee-908d-4a59-8587-2b3d64166923 using kubernetes.io/host-path
```
  • User B: create pvc-b with setting spec.transfer.source.namespace=ns-a and spec.transfer.source.name=pvc-a.
```
cat << EOF | kubectl create -n ns-b -f - 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-b
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  transfer:
    source:
      namespace: ns-a
      name: pvc-a
EOF

cluster/kubectl.sh describe -n ns-b pvc pvc-b
Name:          pvc-b
Namespace:     ns-b
StorageClass:  standard
Status:        Pending
Volume:        pvc-18304fee-908d-4a59-8587-2b3d64166923
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      0
Access Modes:  
VolumeMode:    Filesystem
Transfer:
  Source:
    Namespace:  ns-a
    Name:       pvc-a
Used By:        <none>
Conditions:
  Type        Status  LastProbeTime                     LastTransitionTime                Reason  Message
  ----        ------  -----------------                 ------------------                ------  -------
  Receiving   True    Mon, 01 Jan 0001 00:00:00 +0000   Mon, 03 May 2021 14:52:15 +0000           Receiving transferred volume.
Events:       <none>

cluster/kubectl.sh describe -n ns-b pvc pvc-b
Name:          pvc-b
Namespace:     ns-b
StorageClass:  standard
Status:        Bound
Volume:        pvc-18304fee-908d-4a59-8587-2b3d64166923
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      5Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Transfer:
  Source:
    Namespace:  ns-a
    Name:       pvc-a
Used By:        <none>
Events:         <none>
```
  • User A: (And may delete pvc-a later, after it is transferred).
```
cluster/kubectl.sh describe -n ns-a pvc pvc-a
Name:          pvc-a
Namespace:     ns-a
StorageClass:  standard
Status:        Lost
Volume:        
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/host-path
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      
Access Modes:  
VolumeMode:    Filesystem
Transfer:
  Destination:
    Namespace:  ns-b
    Name:       pvc-b
Used By:        <none>
Conditions:
  Type          Status  LastProbeTime                     LastTransitionTime                Reason  Message
  ----          ------  -----------------                 ------------------                ------  -------
  Transferred   True    Mon, 01 Jan 0001 00:00:00 +0000   Mon, 03 May 2021 14:52:54 +0000           Transferring volume has been completed.
Events:
  Type    Reason                 Age    From                         Message
  ----    ------                 ----   ----                         -------
  Normal  ProvisioningSucceeded  2m12s  persistentvolume-controller  Successfully provisioned volume pvc-18304fee-908d-4a59-8587-2b3d64166923 using kubernetes.io/host-path
  Normal  ClaimLost              25s    persistentvolume-controller  Bound PersistentVolume has been transfered. This PersistentVolumeClaim can safely be deleted.

kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM        STORAGECLASS   REASON   AGE
pvc-18304fee-908d-4a59-8587-2b3d64166923   5Gi        RWO            Delete           Bound    ns-b/pvc-b   standard                2m38s
```

@mkimuram (Contributor) commented May 3, 2021

@saad-ali

I read through the previous discussions and realized that my previous two comments are just rolling the discussion back.

Do we really need to introduce new generic CRDs for transfer, instead of just adding fields for transfer to specific APIs that support this feature?

I think:

  • From a user experience perspective, it would be easier to learn fewer APIs, unless the spec differs significantly between APIs. Also, users would expect to be able to change a resource directly through that resource's own API.
  • From a semantic perspective, PersistentVolumeClaim is already an abstraction that requests a PersistentVolume with a certain specification, therefore it would be the right place to specify which PersistentVolume to bind to or unbind from (and this is also true for VolumeSnapshot/VolumeSnapshotContent).
  • We could later introduce new generic transfer APIs that just wrap this implementation, if needed. (Actually, we would need to add similar fields to the PVC even if we introduce additional CRDs, so that we can keep track of the PVC's status. For example, if two StorageTransferRequests reference the same source PVC with different destination PVCs, or if a StorageTransferRequest is deleted while transferring, keeping the status only in the StorageTransferRequest won't be enough.)


For VolumeSnapshots, this performs the following:
- Creates a VolumeSnapshot in the target namespace.
- Binds the VolumeSnapshotContent to the newly created VolumeSnapshot. If this was a dynamically provisioned VolumeSnapshotContent, then it adjusts the VolumeSnapshotContent.Spec.Source to point to the newly created
Contributor

VolumeSnapshotContent.Spec.Source points to a VolumeHandle or a SnapshotHandle. It does not point to a VolumeSnapshot. It is VolumeSnapshotContent.Spec.VolumeSnapshotRef that points to a VolumeSnapshot. Also VolumeSnapshotContent.Spec.VolumeSnapshotRef is immutable so you cannot rebind it.

The suggested approach is to create a new VolumeSnapshotContent, point Source to the existing SnapshotHandle, and point VolumeSnapshotRef to the new VolumeSnapshot.

CC @yuxiangqian
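A minimal sketch of that suggested approach, using the snapshot.storage.k8s.io/v1 API (object names and driver are illustrative): a new VolumeSnapshot in the target namespace plus a new, pre-provisioned VolumeSnapshotContent whose source points at the existing snapshotHandle:

```
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: bar
  namespace: target-namespace
spec:
  source:
    volumeSnapshotContentName: transferred-content
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotContent
metadata:
  name: transferred-content
spec:
  deletionPolicy: Retain
  driver: hostpath.csi.k8s.io                      # illustrative driver
  source:
    snapshotHandle: <existing-backend-snapshot-handle>
  volumeSnapshotRef:
    name: bar
    namespace: target-namespace
```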

these resources between namespaces, allowing users to conveniently move these resources without manual creation of each independent object (VolumeSnapshot, VolumeSnapshotContent).

This extends the [API for PVC Namespace Transfer](https://github.com/kubernetes/enhancements/pull/2110/files) to also apply for VolumeSnapshots.
Contributor

Looks like this KEP only handles VolumeSnapshot namespace transfer. The KEP on PVC (#2110) is closed. Is there anyone working on PVC namespace transfer?


Hi, based on the title of this KEP, I closed #2110 but can of course reopen it. I am still very interested in this feature getting into k8s.

VolumeSnapshot.
- Deletes the source VolumeSnapshot.

The external-snapshotter will need to be updated to remove immutability of the following fields:
Contributor

I don't think we should remove the immutability of these fields. Instead a new VolumeSnapshotContent object should be created.

CC @yuxiangqian

@xing-yang (Contributor)

@mkimuram In your POC, you are proposing to add a Transfer.Source and Transfer.Destination field in the PVC spec. At the end of the transfer, will a new PVC be created with the name and namespace based on the Transfer.Destination field?

@mkimuram (Contributor) commented Jul 1, 2021

At the end of the transfer, will a new PVC be created with the name and namespace based on the Transfer.Destination field?

No. The new PVC is manually created by a user who has access to the namespace. From a UX viewpoint, it can be regarded as the same action as the receiver creating a StorageTransferAccept in the current KEP. The transfer only happens after the new PVC is created manually. Please click "Details" in the previous comment and see the console log; in that example, User B creates the new PVC.

@mkimuram (Contributor) commented Jul 2, 2021

@xing-yang

I hope that this diagram helps explain my intention.

Note that while the PVC's condition is "transferring", the attach/detach controller is modified to prevent a PVC with phase: Bound from being consumed. Therefore, it is safe even if the PV is bound to two PVCs. Also, only one of PVC A or PVC B has a reference to the PV at any time, and a transfer can only happen between two PVCs, no more than two.

@xing-yang (Contributor) commented Jul 12, 2021

We had a meeting to discuss this today. @Elbehery summarized all previous efforts and the two options. Meeting notes are here: https://docs.google.com/document/d/17H1k4lqdtJwZSjNRaQue-FhMhyk14JA_MoURpoxha5Q/edit#

Meeting recording is here: https://www.youtube.com/watch?v=7VMp8Asf_P8

- `r.spec.acceptName == a.metadata.name`
- `r.spec.token == a.spec.requestToken`

Once matching is complete, a controller will begin the transfer process. For PVCs, this performs the following:
@mkimuram (Contributor) commented Jul 12, 2021

@mhenriks @huffmanca

What happens if

  • two StorageTransferRequests are created for the same resource and StorageTransferAccept for each StorageTransferRequest is created almost at the same time,
  • two StorageTransferAccepts for the same StorageTransferRequest are created almost at the same time in different namespaces,
  • two StorageTransferRequests that have the same source.kind and targetName are created, and StorageTransferAccepts for both of them are created in the same namespace,
  • a StorageTransferRequest is deleted before transfer completes,
  • a StorageTransferAccept is deleted before transfer completes,
  • a resource with the same criteria (namespace, name, and so on) in StorageTransferRequest is created after a transfer for the intended resource is completed (do users need to ensure that StorageTransferRequest and StorageTransferAccept are deleted after a successful transfer)?


two StorageTransferRequests are created for the same resource and StorageTransferAccept for each StorageTransferRequest is created almost at the same time,
two StorageTransferAccepts for the same StorageTransferRequest are created almost at the same time in different namespaces,
two StorageTransferRequests that have the same source.kind and targetName are created, and StorageTransferAccepts for both of them are created in the same namespace,

I think these are all basically the same issue. The transfer controller will process the first match it encounters and ignore the others. An annotation on the source can probably help us with the ignore case. For applications with a possibility of contention like this, it would probably be wise to have a standard naming scheme for requests/accepts and handle collisions that way.


a StorageTransferRequest is deleted before transfer completes,
a StorageTransferAccept is deleted before transfer completes,

I imagine a pending and a running phase for a transfer. During the pending phase (waiting for no pods using the source, etc.) a transfer is easily aborted. Once a transfer is running, things should happen pretty quickly and finalizers can make sure the process completes. However, if an error is encountered in the running phase, the transfer can be aborted and the system remains in the current state (no rollback).


a resource with the same criteria (namespace, name, and so on) in StorageTransferRequest is created after a transfer for the intended resource is completed (do users need to ensure that StorageTransferRequest and StorageTransferAccept are deleted after a successful transfer)?

Requests/Accepts should be deleted by the user, or perhaps only by an admin, so that proper auditing occurs.

Contributor

@mhenriks

Thank you for sharing your ideas. Can this information be added to the KEP so that we can discuss further corner cases in more detail?

Regarding changes to the spec, the points below would be worth considering.

  • What information is needed in annotations to mark a PVC as transferring?
    (One annotation that holds the identities of both the StorageTransferRequest and the StorageTransferAccept, or two separate annotations, one for each?)
  • Do we need to add a finalizer to both StorageTransferRequest and StorageTransferAccept so that the annotations on a transferring PVC can be properly removed on cancellation?
  • Should we consider adding targetNamespace to StorageTransferRequest.Spec?
    (The current spec seems to allow one StorageTransferRequest to have multiple target namespace candidates; as a result, the annotation logic above is likely to become complex. If we change the spec this way, we can focus on one annotation that holds only the StorageTransferRequest's identity.)
  • Do StorageTransferRequest and StorageTransferAccept need Status fields to manage the pending and running phases?
    (And possibly, should we consider adding a completed phase to avoid running the transfer again after it has succeeded?)
  • Can we consider a way to roll back even after the transfer enters the running phase?

Also, an example of a further corner case is how to block, or coordinate with, operations by other controllers, like provisioning, attaching, snapshotting, resizing, etc. (Do we just make the other controllers understand the annotations, or should we consider adding status fields to the PVC for this purpose? And if we choose the latter, should the information in the annotations above also be moved to status fields?)

@mkimuram (Contributor) commented Jul 15, 2021

@bswartz

As for updating secretRefs on the PV, I've implemented a prototype on top of the previous PoC commits. The idea is to modify the PV based on its StorageClass's template before it is bound to the destination PVC (this means it only works well for the dynamically provisioned case, not for the manually provisioned case).

Please check the last two commits of this branch.

The current code, i.e. the last commit, blindly allows updating the PV's secretRefs. However, we could add fields to the PV, like a transferring status/condition or previousPVC, and only allow updating secretRefs when such fields have the expected values. Or we may be able to find a point at which some of the PV's existing fields clearly identify that the PV is being transferred.

Tested with the commands below; please also double-check:

  • Create storageClass which uses templates for secretRefs
```
cat << 'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-hostpath-sc-with-secret
provisioner: hostpath.csi.k8s.io
parameters:
  csi.storage.k8s.io/controller-publish-secret-namespace: ${pvc.namespace}
  csi.storage.k8s.io/controller-publish-secret-name: ${pvc.name}
  csi.storage.k8s.io/node-stage-secret-namespace: ${pvc.namespace}
  csi.storage.k8s.io/node-stage-secret-name: ${pvc.name}-${pv.name}
  csi.storage.k8s.io/node-publish-secret-namespace: default
  csi.storage.k8s.io/node-publish-secret-name: ${pvc.namespace}-${pvc.name}-${pv.name}
  csi.storage.k8s.io/controller-expand-secret-namespace: default
  csi.storage.k8s.io/controller-expand-secret-name: fixed-name-secret
EOF
```
  • Create namespaces to test
```
kubectl create ns ns-a
kubectl create ns ns-b
```
  • Create PVC pvc-a in namespace ns-a
```
cat << 'EOF' | kubectl apply -n ns-a  -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-a
spec:
  storageClassName: csi-hostpath-sc-with-secret
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF
```
  • Check the PV name for PVC pvc-a
```
kubectl get pvc pvc-a -n ns-a -o jsonpath='{.spec.volumeName}'
pvc-a5dfd2c3-e5fe-4bbe-8410-d7c6a8222678
```
  • Check the PV's secretRefs
```
kubectl get pv $(kubectl get pvc pvc-a -n ns-a -o jsonpath='{.spec.volumeName}') -o json | jq '.spec.csi'
{
  "controllerExpandSecretRef": {
    "name": "fixed-name-secret",
    "namespace": "default"
  },
  "controllerPublishSecretRef": {
    "name": "pvc-a",
    "namespace": "ns-a"
  },
  "driver": "hostpath.csi.k8s.io",
  "fsType": "ext4",
  "nodePublishSecretRef": {
    "name": "ns-a-pvc-a-pvc-a5dfd2c3-e5fe-4bbe-8410-d7c6a8222678",
    "namespace": "default"
  },
  "nodeStageSecretRef": {
    "name": "pvc-a-pvc-a5dfd2c3-e5fe-4bbe-8410-d7c6a8222678",
    "namespace": "ns-a"
  },
  "volumeAttributes": {
    "storage.kubernetes.io/csiProvisionerIdentity": "1626384546874-8081-hostpath.csi.k8s.io"
  },
  "volumeHandle": "c7df5132-e5b3-11eb-9372-0242ac110004"
}
```
  • Request transfer from namespace ns-a
```
kubectl patch -n ns-a pvc pvc-a -p '{"spec":{"transfer": {"destination": {"namespace": "ns-b", "name": "pvc-b"}}}}'
```
  • Accept transfer from namespace ns-b as PVC pvc-b
```
cat << EOF | kubectl create -n ns-b -f - 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-b
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  transfer:
    source:
      namespace: ns-a
      name: pvc-a
EOF
```
  • Confirm that PV name for the pvc-b matches the one previously bound to pvc-a
```
kubectl get pvc pvc-b -n ns-b -o jsonpath='{.spec.volumeName}'
pvc-a5dfd2c3-e5fe-4bbe-8410-d7c6a8222678
```
  • Confirm that the PV's secretRefs are updated based on the transferred PVC's information and the template in the SC
```
kubectl get pv $(kubectl get pvc pvc-b -n ns-b -o jsonpath='{.spec.volumeName}') -o json | jq '.spec.csi'
{
  "controllerExpandSecretRef": {
    "name": "fixed-name-secret",
    "namespace": "default"
  },
  "controllerPublishSecretRef": {
    "name": "pvc-b",
    "namespace": "ns-b"
  },
  "driver": "hostpath.csi.k8s.io",
  "fsType": "ext4",
  "nodePublishSecretRef": {
    "name": "ns-b-pvc-b-pvc-a5dfd2c3-e5fe-4bbe-8410-d7c6a8222678",
    "namespace": "default"
  },
  "nodeStageSecretRef": {
    "name": "pvc-b-pvc-a5dfd2c3-e5fe-4bbe-8410-d7c6a8222678",
    "namespace": "ns-b"
  },
  "volumeAttributes": {
    "storage.kubernetes.io/csiProvisionerIdentity": "1626384546874-8081-hostpath.csi.k8s.io"
  },
  "volumeHandle": "c7df5132-e5b3-11eb-9372-0242ac110004"
}
```

- Wait for all Pods to stop using the PVC source-namespace/foo
- Bind the PersistentVolume to target-namespace/bar.
- Create the PVC target-namespace/bar by copying the spec of source-namespace/foo and setting spec.volumeName appropriately.
- Reset the PersistentVolume reclaim policy
@mkimuram (Contributor) commented Jul 26, 2021

How to reset it? Who holds the original policy and where it is stored?
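As a sketch of the third step in the excerpt above, the copied claim in the target namespace would carry the source's spec plus an explicit spec.volumeName so that it binds to the retained PersistentVolume (values illustrative, reusing names from this thread):

```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: bar
  namespace: target-namespace
spec:
  accessModes:
    - ReadWriteOnce            # copied from source-namespace/foo
  resources:
    requests:
      storage: 5Gi             # copied from source-namespace/foo
  volumeName: pvc-18304fee-908d-4a59-8587-2b3d64166923   # the PV previously bound to foo
```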

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Oct 24, 2021
@jsafrane (Member)

There is a follow-up PR #2849, closing this one
/close

@k8s-ci-robot (Contributor)

@jsafrane: Closed this PR.

In response to this:

There is a follow-up PR #2849, closing this one
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
